Modality, presentation, domain and training effects in statistical learning


Sci Rep. 2022; 12: 20878. Published online 2022 Dec 3. doi: 10.1038/s41598-022-24951-7. PMCID: PMC9719496. PMID: 36463280

Krisztina Sára Lukics1,2 and Ágnes Lukács1,2

1Department of Cognitive Science, Budapest University of Technology and Economics, Műegyetem rkp. 3., H-1111 Budapest, Hungary

2MTA-BME Momentum Language Acquisition Research Group, Eötvös Loránd Research Network (ELKH), Budapest, Hungary


Krisztina Sára Lukics, Email: [email protected] (corresponding author).

Received 2022 May 5; Accepted 2022 Nov 22.

Copyright © The Author(s) 2022

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Supplementary Materials

Supplementary Information 1: 41598_2022_24951_MOESM1_ESM.docx (15K)

Supplementary Information 2: 41598_2022_24951_MOESM2_ESM.docx (224K)

Supplementary Information 3: 41598_2022_24951_MOESM3_ESM.xlsx (76K)

Data Availability Statement

Study materials and data of the experiment are available at https://osf.io/hzg7w/?view_only=639eb1b8b8dd4325a7a5ce25c08cee9b.

Abstract

While several studies suggest that the nature and properties of the input have significant effects on statistical learning, they have rarely been investigated systematically. In order to understand how input characteristics and their interactions impact statistical learning, we explored the effects of modality (auditory vs. visual), presentation type (serial vs. simultaneous), domain (linguistic vs. non-linguistic), and training type (random, starting small, starting big) on artificial grammar learning in young adults (N = 360). With serial presentation of stimuli, learning was more effective in the auditory than in the visual modality. However, with simultaneous presentation of visual and serial presentation of auditory stimuli, the modality effect was not present. We found a significant domain effect as well: a linguistic advantage over nonlinguistic material, which was driven by the domain effect in the auditory modality. Overall, the auditory linguistic condition had an advantage over other modality-domain types. Training types did not have any overall effect on learning; starting big enhanced performance only in the case of serial visual presentation. These results show that input characteristics such as modality, presentation type, domain and training type influence statistical learning, and suggest that their effects are also dependent on the specific stimuli and structure to be learned.

Subject terms: Psychology, Human behaviour

Introduction

Our surroundings are full of structured patterns and regularities. To operate efficiently in this complex environment, an organism has to be equipped with abilities to find, learn, and utilize these environmental structures and regularities. Statistical learning is a powerful mechanism for extracting and encoding structure from environmental stimuli1. This form of learning is ubiquitous in human cognition: studies have shown that it is present in the auditory, visual, and tactile modalities and across the linguistic and nonlinguistic domains2–12, and that it also operates in multimodal, visuomotor tasks13,14.

Statistical learning supports many skills in our everyday life. For instance, language, consisting of complex patterns and regularities on multiple levels, has been suggested to rely on it15–25. While the contribution of statistical learning might be most frequently highlighted in language, several results have shown that this mechanism is not limited to it: it has an important role in domains such as music acquisition26, event processing27, or acquiring complex visual stimuli like scenes or faces28, suggesting a diverse and varied role for statistical learning in human cognition. As the human cognitive system faces great diversity in learning materials, differences in the properties of the input may impose different constraints on statistical learning in each area1,29. Input constraints are especially important in statistical learning because this form of learning is model-free and input-driven compared to other forms of learning like reinforcement learning or declarative learning29. To understand how this fundamental mechanism operates in different areas of cognition, we aim to uncover how input characteristics and their interactions affect learning.

While the variety of areas where statistical learning is present may suggest generality, direct comparisons of learning the same structure with stimuli from different domains and modalities indicate the presence of modality- and domain-specific constraints. (In the present paper, we use domain to refer to the content of representations, more specifically, to denote the linguistic-nonlinguistic distinction in our tasks.) These effects have mostly been demonstrated in artificial language learning tasks, where a few novel items are organized into sequences based on simple patterns. After being exposed to a set of grammatical sequences, humans are able to distinguish grammatical from ungrammatical sequences. These studies have shown that statistical learning of serially presented (i.e., presented one stimulus after the other) nonlinguistic auditory patterns is more efficient than the extraction of serial nonlinguistic visual patterns, which in turn is more efficient than learning serial nonlinguistic tactile patterns3. In general, statistical learning is assumed to have modality-specific or even stimulus-specific characteristics1,29.

Importantly, these modality effects are likely to result from differences in the parameters of optimal presentation. While sequential information in the auditory modality is only available through serial presentation of stimuli, for visual sequences, serial and simultaneous presentation are both feasible. Simultaneous presentation, where the items of a sequence are presented together at the same time, seems to be optimal for statistical learning of nonlinguistic visual sequences30,31. When visual information is presented simultaneously, performance is similar to nonlinguistic auditory learning30,31. Presentation rates also affect learning differently in different modalities, and slower rates seem to facilitate visual statistical learning: when the presentation rate is slower with serial linguistic visual than with serial linguistic auditory stimuli, learning performance is equivalent between the modalities69.

While modality differences in statistical learning have been demonstrated in several studies, tests of domain effects, e.g. direct comparisons of linguistic versus nonlinguistic materials, are hard to find. One notable exception is Saffran31, who explored both domain (linguistic versus nonlinguistic) and modality (auditory versus visual) effects in an artificial grammar learning task and found no overall advantage of sequence learning in the linguistic over the nonlinguistic domain (or in the auditory over the visual modality) with serial presentation of sequences. However, the focus of that study was on contrasting two types of grammars (grammars with predictive and non-predictive dependencies) within each condition, instead of directly comparing performance across domains and modalities. Although it was not the primary focus of their study, Hoch, Tyler and Tillmann70 directly compared statistical learning in the linguistic and nonlinguistic domains, observing significantly higher levels of learning in the linguistic than in the nonlinguistic domain.

Besides constraints imposed by modality, presentation type and domain, different arrangements of stimuli during training (training type) also influence statistical learning. The starting small hypothesis assumes that incremental presentation of stimuli of different length (and complexity) enhances statistical learning in both humans and neural networks32. In another formulation, under the less is more hypothesis33,34, cognitive limitations, such as reduced working memory capacity, help in learning complex patterns and systems. Research on less is more/starting small in human learners is methodologically diverse and has yielded conflicting results35–38. On the one hand, contrary to the predictions of the less is more hypothesis, several studies found that the acquisition of grammar structures in artificial grammar learning tasks is more effective in adults than in children39. On the other hand, simulating reduced working memory capacity in adults seems to facilitate learning in some studies40, but not in others41. A starting big arrangement of stimuli, which begins with the longer strings of the grammar, has also been argued and demonstrated to result in superior performance by allowing larger chunks to be learned first and parsed later42. However, it may also lead to false hypotheses about grammar structure32,43, or prevent generalization of rules40. To summarize, starting small and starting big training types lead to more efficient learning in some cases, but further research is needed to test the conditions in which they boost learning.

Although the effects of input characteristics like modality, presentation, domain and training type have been examined before, previous research only investigated these effects on statistical learning separately, calling for further studies with direct comparisons. Furthermore, many studies used different statistical learning designs, with differences in patterns, stimuli and presentation arrangement. Our aims in this study were to examine modality, presentation, domain and training type effects using Saffran’s31 predictive grammar in order to extend the results of the original study (a) by systematically investigating all combinations of the examined input effects and their interactions (e.g. by also including visual linguistic conditions), and (b) by directly comparing learning performance across conditions. We compare the efficiency of statistical learning in visual (both in serial and simultaneous presentation types) and auditory modalities and across linguistic versus non-linguistic domains. We also wanted to test how training type, namely starting small and starting big, influences learning across these conditions, as training effects have not been examined with finite state, category-based grammars. Figure 1 summarizes the design of experimental conditions in the study.

Figure 1

The design of the study. We systematically investigated the effect of four factors: modality (auditory vs. visual), presentation type (serial vs. simultaneous), domain (linguistic vs. nonlinguistic), and training type (random vs. starting small vs. starting big), yielding 18 conditions altogether. With 20 participants in each group, 360 participants took part in the study.
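The count of 18 conditions follows from the fact that modality and presentation type are not fully crossed: auditory sequences can only be presented serially. A minimal sketch of this arithmetic is given below; the variable names are illustrative and are not taken from the study materials.

```python
from itertools import product

# Auditory stimuli can only be presented serially, so only three
# modality-presentation pairs exist instead of a full 2 x 2 crossing.
MODALITY_PRESENTATION = [
    ("auditory", "serial"),
    ("visual", "serial"),
    ("visual", "simultaneous"),
]
DOMAINS = ["nonlinguistic", "linguistic"]
TRAINING_TYPES = ["random", "starting small", "starting big"]
PARTICIPANTS_PER_GROUP = 20

# Every between-subjects condition is one combination of the factors.
conditions = [
    (modality, presentation, domain, training)
    for (modality, presentation), domain, training
    in product(MODALITY_PRESENTATION, DOMAINS, TRAINING_TYPES)
]
total_participants = len(conditions) * PARTICIPANTS_PER_GROUP

print(len(conditions), total_participants)
```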

Our hypotheses were the following:

(1) Based on previous findings, we expected the advantage of learning in the auditory modality over the visual modality with serial presentation. We also hypothesized that when presentation is optimized for modality, this advantage disappears, and performance in the serial auditory and in the simultaneous visual tasks would be on similar levels.

(2) In the present study, we aimed at directly comparing the acquisition of statistical patterns in the linguistic and nonlinguistic domains. Based on the results of Saffran31, we expected that the linguistic versus nonlinguistic status of stimuli would not have an effect on learning efficiency.

(3) Training effects, starting small and starting big, have not been examined with finite state, category-based grammars. We hypothesized that starting small would facilitate learning compared to presenting training sequences in a random order, as starting small enables the generation of simple and flexible hypotheses about the rule32,43. In contrast, starting big would mainly facilitate learning of specific item-relations, and as a result, we expected it would result in lower learning performance than random training due to less effective hypothesis generation42.

These hypotheses can be translated to the following formulations in our current experimental design, motivating three sets of analyses:

(1) With serial presentation of stimuli, we expected the advantage of learning in the auditory modality in comparison to the visual modality. We hypothesized that there would be no domain effect, that is, the linguistic conditions would not differ from the nonlinguistic conditions. We also expected that starting small training would lead to higher, while starting big would lead to lower performance than presenting training sequences with a different length in a random order.

(2) In the visual conditions, we expected the advantage of simultaneous over serial presentation. In this case, we also expected no domain effect, and an advantage of starting small and disadvantage of starting big stimuli relative to random training.

(3) With presentation optimized for modality, we expected that performance in the serial auditory and in the simultaneous visual tasks would be on similar levels. Here, we also hypothesized no domain effect, and an advantage of starting small and disadvantage of starting big stimuli relative to random training.

Method

Participants

A total of 360 young adults participated in the study. Most of them were university students who were recruited through elective cognitive psychology courses at the Budapest University of Technology and Economics, and received course credit for their participation. The rest of the participants were volunteers who were recruited via convenience sampling. Inclusion criteria were normal or corrected-to-normal hearing and vision, and Hungarian as a native language. Participants were asked to report any neurological and psychiatric conditions (none were reported in our sample). Mean age was 22.5 years (SD = 3.9, minimum = 18.1, maximum = 55.8), and 255 females and 105 males participated in the study. Age information was missing for two participants. All participants were tested with their informed consent, in accordance with the principles set out in the Declaration of Helsinki and the stipulations of the local institutional review board (United Ethical Review Committee for Research in Psychology, ethical approval number: EPKEB-2018/87).

Stimuli

Throughout the conditions, stimuli varied by modality (auditory versus visual) and domain (nonlinguistic versus linguistic). For all conditions, our aim was to design diverse stimulus sets where individual stimuli are well discriminable from each other. For the auditory nonlinguistic conditions, we divided a frequency range that was conveniently perceivable through our laboratory headphones (220–831 Hz) into 15 equal sections following steps of the musical scale to obtain 16 tones. As a result, we obtained intervals larger than standard semitones, and almost as large as standard whole tones (220 Hz, 240 Hz, 263 Hz, 287 Hz, 314 Hz, 343 Hz, 374 Hz, 409 Hz, 447 Hz, 488 Hz, 534 Hz, 583 Hz, 637 Hz, 696 Hz, 761 Hz, 831 Hz). Each tone was 470 ms long. For the auditory linguistic conditions, we used Hungarian CVC nonwords compiled from diverse Hungarian phonemes to promote discriminability (bif, dők, dup, gal, hep, kav, lam, lor, mib, neb, péf, rász, rud, szig, tez, sot). Note that some of the nonwords are four characters long, as they include a phoneme written with a digraph (a two-character grapheme, ‘sz’). Nonwords were recorded from a Hungarian female speaker, and the average length of syllables was 470 ms. In the visual nonlinguistic conditions, 16 meaningless symbols were used that were rich in detail and easily distinguishable from each other. In the visual linguistic conditions, the same syllables were used as in the auditory linguistic conditions. Syllables were visually presented on a white screen in black font. Individual items in each stimulus type were assigned to categories (as illustrated in Fig. 2), and the rules of the artificial grammar were defined over these categories.
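The listed tone frequencies are consistent with 15 equal steps on a logarithmic (musical) scale, that is, a constant frequency ratio between neighbouring tones. The sketch below assumes this equal-ratio spacing, which is an interpretation of "equal sections following steps of the musical scale" rather than a procedure documented by the authors; the function name is illustrative.

```python
LOW_HZ, HIGH_HZ, N_TONES = 220.0, 831.0, 16

def equal_ratio_tones(low, high, n):
    """Return n frequencies from low to high with a constant ratio between neighbours."""
    ratio = (high / low) ** (1 / (n - 1))
    return [low * ratio ** k for k in range(n)]

# Constant ratio between neighbouring tones (assumed spacing rule).
step_ratio = (HIGH_HZ / LOW_HZ) ** (1 / (N_TONES - 1))

# Rounded to whole Hz for comparison with the values listed in the text.
tones = [round(f) for f in equal_ratio_tones(LOW_HZ, HIGH_HZ, N_TONES)]

SEMITONE = 2 ** (1 / 12)    # equal-tempered semitone ratio
WHOLE_TONE = 2 ** (2 / 12)  # equal-tempered whole-tone ratio

print(tones)
print(SEMITONE < step_ratio < WHOLE_TONE)
```

Under this assumption the step ratio lies between a semitone and a whole tone, in line with the description of the intervals in the text.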

Figure 2

Stimulus sets in different conditions by modality and domain. Stimuli in each condition are classified into categories (A, C, D, F, and G). The rules of the grammar are defined on these categories.

Using the grammar (given in Fig. 3, taken from Saffran31) and the condition-specific categorized vocabularies, we generated 58 grammatical sentences of three to five items and 32 phrases of two to three items for the learning phases (90 sequences altogether). Phrases were parts of grammatical sentences. We also generated 24 pairs of grammatical and ungrammatical sequences (9 four-item and 15 five-item sequences) for the test phases in all conditions. Grammatical sentences followed the rules of the grammar, while the ungrammatical ones included a violation of one of the grammatical rules: (1) sentences must contain an AP phrase, (2) D words follow A words, while G words follow C words, (3) sentences must contain an F word, (4) CP phrases must precede F words. As a result, there were four violation types, one for each rule: (1) sentences starting with a BP phrase instead of an AP phrase, (2) sentences where D and G words were interchanged so that G words followed A words and D words followed C words, (3) sentences where F words were exchanged for G words, (4) sentences where CP phrases or parts of the CP phrases were missing before F words. Each violation type was represented by six ungrammatical strings. Within each category, members were randomly distributed across sentences. The full set of training and test sequences for the linguistic conditions, together with their statistical properties, is included as supplementary material online. Sequences of the nonlinguistic conditions were parallel to those of the linguistic conditions, that is, each syllable corresponded to a pure tone and a symbol, respectively. The modality and domain variations of the conditions differed only in their stimulus set.

Figure 3

Rules of the artificial grammar (from Saffran31). Letters A, C, D, F and G refer to categories, which each include a set of items (tones, syllables, or symbols in the different conditions). Items in each category were randomly distributed in sentences. A sentence consists of an “AP” phrase, a “BP” phrase, and an optional “CP” phrase. An “AP” phrase is made of an “A” category item and an optional “D” category item. A “CP” phrase consists of a “C” and an optional “G” item. A “BP” phrase is made of a “CP” phrase and an optional “F” item.
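The rewrite rules in this caption can be expanded mechanically into category-level sentence templates. The sketch below follows the caption literally, including the optional "F" item; the text separately states that sentences must contain an F word, so templates containing "F" are also filtered out as a second set. The variable names are illustrative, and the enumeration makes no claim about which templates were actually used in the study's training or test sets.

```python
from itertools import product

# Each phrase type is a list of its possible expansions (tuples of categories).
AP = [("A",), ("A", "D")]                               # A plus optional D
CP = [("C",), ("C", "G")]                               # C plus optional G
BP = [cp + tail for cp in CP for tail in [(), ("F",)]]  # CP plus optional F
OPTIONAL_CP = [()] + CP                                 # sentence-final CP may be absent

# Sentence = AP + BP + optional CP, as described in the caption.
templates = [ap + bp + cp for ap, bp, cp in product(AP, BP, OPTIONAL_CP)]

# Subset satisfying the additional rule that a sentence must contain an F word.
with_f = [t for t in templates if "F" in t]

for template in with_f:
    print(" ".join(template))
```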

Procedure

Participants were tested in a silent room in groups of two or three. The testing was administered using E-Prime 2.0 Professional. The test administration took approximately 15 min, and consisted of a training phase and a test phase for all conditions.

In the auditory conditions, items were presented with no pauses between them. In the visual conditions, we applied two presentation types: in the serial conditions, one item was presented at a time on the center of the screen for 800 ms, followed by the next item with no pauses, while in the simultaneous conditions, all items of a sentence were presented together on the screen at the same time. (Pilot data from our lab on a simpler segmentation task showed no learning effect in the case of visual statistical learning when stimulus timing was matched to that of acoustic statistical learning and set to 470 ms. This was one of the reasons for using a longer presentation time: we wanted to avoid floor effects in a more complex task in the visual modality. Choosing longer presentation times was also motivated by earlier studies showing that longer presentation times in visual statistical learning indeed promote learning69,76,77. Since visual presentation rates vary between 400 and 1200 ms in the literature, we decided to go with a mid-range 800 ms that was significantly longer than what we used in our pilot studies.) Presentation time was adjusted to sequence length (the number of items times 800 ms). During the training phase, participants were instructed to simply attend to the presented sequences.

In all combinations of modality, presentation and domain, we examined the effects of three different training types. All conditions presented the same set of sequences; small and big were not defined in absolute terms, they refer to the relative length of sentences within the same training set. In the random conditions, sequences of different length were presented in a random order; the starting small conditions involved incremental presentation of sentences ordered by length, starting with the shortest sequence; the starting big condition was the reverse of the starting small condition, starting with the longest sequences and gradually proceeding towards the shortest ones. It is important to point out that the shortest strings were not full sentences of the grammar, but they were structural units (phrases) of the language.
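The three training types differ only in how the same set of sequences is ordered. A minimal sketch of such an ordering is shown below; the function name, the seed parameter, and the random ordering of sequences of equal length are assumptions for illustration, and the demo sequences are arbitrary combinations of the study's syllables, not actual training items.

```python
import random

def order_training(sequences, training_type, seed=0):
    """Order one fixed set of sequences according to the training type."""
    rng = random.Random(seed)
    shuffled = list(sequences)
    rng.shuffle(shuffled)  # random order; also randomizes ties within a length
    if training_type == "random":
        return shuffled
    if training_type == "starting small":
        # sorted() is stable, so equal-length sequences keep their shuffled order.
        return sorted(shuffled, key=len)
    if training_type == "starting big":
        return sorted(shuffled, key=len, reverse=True)
    raise ValueError(f"unknown training type: {training_type}")

# Hypothetical demo sequences (not taken from the study materials).
demo = [
    ("bif", "kav", "lam"),
    ("dup", "hep"),
    ("gal", "lor", "mib", "neb", "rud"),
    ("tez", "sot", "dup", "bif"),
    ("kav", "lam"),
]

small = order_training(demo, "starting small")
big = order_training(demo, "starting big")
rand = order_training(demo, "random")
print([len(s) for s in small], [len(s) for s in big])
```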

In the two-alternative forced choice (2AFC) test phase, participants were told that the sequences presented before were in an unknown language. In each of the 24 trials, they were then presented with a pair of sequences: a grammatical sentence and a sentence containing a violation. The grammatical-ungrammatical order within the sequence pair was counterbalanced across the trials. The order of the trials was random, but sentence pairs were preset. Participants were instructed to choose the sentence that was more similar to the sentences of the unknown language in the training phase and to indicate it by pressing the corresponding key (‘1’ for the first sentence and ‘2’ for the second). The two sentences followed each other with a 2000 ms pause. Higher than chance scores (choosing the grammatical member of the pair significantly more than 50% of the time) were taken as evidence of learning.
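Scoring in a 2AFC task of this kind reduces to the proportion of trials on which the grammatical member of the pair was chosen, and group performance can then be compared against the chance level of 0.5 with a one-sample t statistic. The sketch below illustrates this with standard-library code only; the function names and the demo data are hypothetical, and the p-value lookup (which needs the t distribution with n - 1 degrees of freedom) is left out.

```python
from math import sqrt
from statistics import mean, stdev

CHANCE = 0.5  # expected accuracy when guessing between two alternatives

def accuracy(responses, grammatical_positions):
    """Proportion of trials where the chosen position (1 or 2) was the grammatical one."""
    correct = sum(r == g for r, g in zip(responses, grammatical_positions))
    return correct / len(grammatical_positions)

def one_sample_t(scores, mu=CHANCE):
    """One-sample t statistic of per-participant accuracies against mu."""
    n = len(scores)
    return (mean(scores) - mu) / (stdev(scores) / sqrt(n))

# Hypothetical data: one participant's key presses on four trials, and
# made-up group accuracies; neither comes from the study.
demo_accuracy = accuracy([1, 2, 1, 2], [1, 2, 2, 2])
demo_t = one_sample_t([0.5, 0.6, 0.7])
print(demo_accuracy, demo_t)
```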

Results

Data were analyzed and visualized using IBM SPSS Statistics 20, JASP version 0.15.0.0 (ref. 78) and the R package ggplot2, version 3.3.5 (ref. 44). Descriptive statistics of accuracies in the 2AFC task are displayed in Table 1.

Table 1

Descriptive statistics of groups in different modality, presentation, domain and training conditions.

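| Modality | Presentation type | Domain | Training type | Mean accuracy (SD) | Age in years (SD) | Females/males |
|---|---|---|---|---|---|---|
| Auditory | Serial | Nonlinguistic | Random | 0.58 (0.11)** | 21.16 (2.24) | 13/7 |
| Auditory | Serial | Nonlinguistic | Starting small | 0.58 (0.09)*** | 20.66 (1.37) | 13/7 |
| Auditory | Serial | Nonlinguistic | Starting big | 0.56 (0.11)* | 20.42 (1.47) | 19/1 |
| Auditory | Serial | Linguistic | Random | 0.68 (0.10)*** | 22.13 (2.30) | 15/5 |
| Auditory | Serial | Linguistic | Starting small | 0.75 (0.13)*** | 21.92 (1.90) | 17/3 |
| Auditory | Serial | Linguistic | Starting big | 0.71 (0.12)*** | 21.91 (2.92) | 15/5 |
| Visual | Serial | Nonlinguistic | Random | 0.59 (0.13)** | 26.45 (7.35) | 12/8 |
| Visual | Serial | Nonlinguistic | Starting small | 0.53 (0.13) | 23.69 (2.45) | 12/8 |
| Visual | Serial | Nonlinguistic | Starting big | 0.65 (0.16)*** | 21.68 (2.53) | 16/4 |
| Visual | Serial | Linguistic | Random | 0.58 (0.13)* | 24.82 (7.74) | 12/8 |
| Visual | Serial | Linguistic | Starting small | 0.51 (0.12) | 24.46 (2.21) | 13/7 |
| Visual | Serial | Linguistic | Starting big | 0.59 (0.17)* | 22.38 (1.91) | 15/5 |
| Visual | Simultaneous | Nonlinguistic | Random | 0.68 (0.16)*** | 20.91 (1.19) | 13/7 |
| Visual | Simultaneous | Nonlinguistic | Starting small | 0.70 (0.11)*** | 21.28 (0.95) | 16/4 |
| Visual | Simultaneous | Nonlinguistic | Starting big | 0.62 (0.14)*** | 20.59 (1.07) | 18/2 |
| Visual | Simultaneous | Linguistic | Random | 0.66 (0.13)*** | 20.81 (2.19) | 16/4 |
| Visual | Simultaneous | Linguistic | Starting small | 0.65 (0.13)*** | 22.96 (3.95) | 11/9 |
| Visual | Simultaneous | Linguistic | Starting big | 0.62 (0.14)** | 26.04 (5.90) | 9/11 |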

Descriptive statistics of the 2AFC accuracy task in groups in different Modality, Presentation Type, Domain and Training Type conditions. Differences from chance level (0.5) are calculated with one-sample t-tests, *: p 


